Programming, Compilation, and Runtime Support for Processing-In-Memory Based Parallel Architectures

Authors

  • Jay B. Brockman
  • Hans P. Zima
Abstract

Processing-In-Memory (PIM) systems avoid the von-Neumann bottleneck in conventional machines by integrating high-density DRAM and CMOS logic on the same chip. Parallel systems based on this technology are expected to provide higher scalability, adaptability, robustness, fault tolerance, and lower power consumption than current MPPs or commodity clusters. Most current PIM-related research deals with hardware issues. In this paper, we outline the main ideas of a project intending to demonstrate that a high-level language approach can be successfully applied to massively parallel PIM-based architectures. We define a generalized abstract PIM architecture reflecting the characteristics of most current approaches to PIM, and develop new compilation and runtime technology to support optimizing translation from a Fortran 90 language extension to PIM assembly code.
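The paper's source language is a Fortran 90 extension, so the fragment below is only an assumed, language-shifted illustration in C of the kind of data-parallel array operation such a compiler would partition across PIM nodes, each node updating the slice of the array resident in its own on-chip DRAM.

#include <stdio.h>
#include <stddef.h>

/* Hypothetical sketch (not the paper's notation, which is a Fortran 90
 * extension): a data-parallel array update that a PIM compiler would
 * split so that each PIM node operates only on the array slice held in
 * its local DRAM, avoiding off-chip traffic. */
static void scale_and_add(double *a, const double *b, double s, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] += s * b[i];   /* each node would handle its local range of i */
}

int main(void)
{
    double a[4] = {1, 2, 3, 4}, b[4] = {4, 3, 2, 1};
    scale_and_add(a, b, 0.5, 4);
    printf("%.1f %.1f %.1f %.1f\n", a[0], a[1], a[2], a[3]);
    return 0;
}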


Similar resources

Shared memory multiprocessor support for functional array processing in SAC

Classical application domains of parallel computing are dominated by processing large arrays of numerical data. Whereas most functional languages focus on lists and trees rather than on arrays, SaC is tailor-made in design and in implementation for efficient high-level array processing. Advanced compiler optimizations yield performance levels that are often competitive with low-level imperative...


Towards Automatic Support of Parallel Sparse

In this paper, we present a generic matrix class in Java and a runtime environment with continuous compilation, aiming to support automatic parallelization of sparse computations in distributed environments. Our package comes with a collection of matrix classes, including operators for dense, sparse, and parallel matrices on distributed-memory environments. In our environment, a progr...


Towards Efficient OpenMP Strategies for Non-Uniform Architectures

This work targets Non-Uniform Memory Access (NUMA) based processor architectures. In these architectures, analyzing and considering the non-uniformity is of high importance for improving the scalability of systems. In this paper, we analyze and develop a NUMA-based approach for the OpenMP parallel programming model. Our technique applies a smart thread allocation method and an advanced task scheduling strategy for reducing r...
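As a rough illustration of NUMA-aware thread placement in OpenMP, using the standard proc_bind clause (OpenMP 4.0+) rather than the paper's own allocation and scheduling strategy, which the abstract does not detail, a minimal C sketch might look like this:

/* Minimal C/OpenMP sketch of NUMA-aware thread placement; illustrative
 * only, not the strategy proposed in the paper above. Typically combined
 * with OMP_PLACES=cores (or sockets) at run time. */
#include <omp.h>
#include <stdio.h>

int main(void)
{
    enum { N = 1 << 20 };
    static double data[N];

    /* Bind threads close to the master thread's place so each thread
     * mostly touches memory on its own NUMA node (first-touch policy). */
    #pragma omp parallel for proc_bind(close) schedule(static)
    for (int i = 0; i < N; i++)
        data[i] = (double)i;

    printf("max threads: %d\n", omp_get_max_threads());
    return 0;
}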


Runtime Support for Task Migration on Distributed Memory Architectures

The use of the task migration paradigm has been shown to allow efficient execution of unstructured codes on distributed-memory parallel architectures. With this model, the data distributed across the parallel processors are never moved. When access to a non-local variable is necessary, the current computation is suspended and then resumed on the processor in charge of this variable. Our implementa...
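A minimal C sketch of this "move the computation, not the data" idea is shown below; the runtime primitives are hypothetical single-node stubs for illustration, not the paper's actual implementation.

/* Conceptual sketch: when a non-local variable is touched, the
 * computation migrates to the owning processor instead of fetching the
 * data. migrate_to() and the ownership check are hypothetical stubs. */
#include <stdio.h>

typedef struct { int owner; double value; } remote_var;

static int current_node = 0;

static void migrate_to(int node)   /* real runtime: suspend here, resume there */
{
    current_node = node;
}

static double fetch(const remote_var *v)
{
    if (v->owner != current_node)  /* non-local access triggers migration */
        migrate_to(v->owner);
    return v->value;               /* now a purely local read */
}

int main(void)
{
    remote_var x = { .owner = 3, .value = 42.0 };
    printf("value = %.1f, running on node %d\n", fetch(&x), current_node);
    return 0;
}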


A Compile-Time OpenMP Cost Model

OpenMP is a de facto standard API for parallel programming in C/C++ and Fortran on shared-memory and distributed-shared-memory platforms. It is also increasingly used together with MPI to form a hybrid programming model and is expected to be a promising candidate for exploiting emerging multicore architectures. An OpenMP cost model is an analytical model that reflects the characteristics of OpenMP applications...
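As a generic, textbook-style illustration of what such an analytical model computes (not the paper's actual cost model), a simple fork-join estimate in C could look like:

/* Illustrative compile-time estimate for an OpenMP parallel loop:
 * serial work divided over the threads plus a fixed fork/join overhead.
 * The numbers and the formula are assumptions, not taken from the paper. */
#include <stdio.h>

static double estimate_parallel_time(double serial_cycles, int threads,
                                     double fork_join_overhead_cycles)
{
    /* T_par = T_serial / P + T_overhead */
    return serial_cycles / threads + fork_join_overhead_cycles;
}

int main(void)
{
    double t = estimate_parallel_time(1.0e6, 8, 5.0e3);
    printf("estimated parallel time: %.0f cycles\n", t);
    return 0;
}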



Journal title:

Volume   Issue

Pages  -

Publication date: 2003